Abstract

The present article serves as an erratum to our paper of the same title, which was presented and 
published in the KDD 2014 conference. In that article, we claimed falsely that the objective function 
defined in Section 1.4 is non-monotone submodular. We are deeply indebted to Debmalya Mandal, Jean
Pouget-Abadie and Yaron Singer for bringing to our attention a counter-example to that claim. 

Subsequent to becoming aware of the counter-example, we have shown that the objective function is 
in fact NP-hard to approximate to within a factor of $O(n^{1-\epsilon})$ for any $\epsilon > 0$.

In an attempt to fix the record, the present article combines the problem motivation, models, and 
experimental results sections from the original incorrect article with the new hardness result. We would 
like readers to only cite and use this version (which will remain an unpublished note) instead of the 
incorrect conference version. 


Department of Computer Science, University of Southern California; xinranhe@usc.edu
Department of Computer Science, University of Southern California; dkempe@usc.edu





1 Introduction 


The processes and dynamics by which information and behaviors spread through social networks have long 
interested scientists within many areas. Understanding such processes has the potential to shed light on human
social structure, and to impact the strategies used to promote behaviors or products. While the interest
in the subject is long-standing, recent increased availability of social network and information diffusion data
(through sites such as Facebook, Twitter, and LinkedIn) has raised the prospect of applying social network
analysis at a large scale to positive effect. Consequently, the resulting algorithmic questions have received 
widespread interest in the computer science community. 

Among the broad algorithmic domains, Influence Maximization has been repeatedly held up as having 
the potential to be of societal and financial value. The high-level hope is that based on observed data — such 
as social network information and past behavior — an algorithm could infer which individuals are likely to 
influence which others. This information could in turn be used to effect desired behavior, such as refraining 
from smoking, using superior crops, or purchasing a product. In the latter case, the goal of effecting desired 
behavior is usually termed viral marketing. 

Consequently, both the problem of inferring the influence between individuals mmmmm and that 
of maximizing the spread of a desired behavior have been studied extensively. For the Influence Maximization 
problem, a large number of models have been proposed, along with many heuristics with and without 
approximation guarantees [3 HI [H [13 U3 [201 [53 [23 HI] . (See the monograph [3 for a recent overview of 
work in the area.) 

However, one crucial aspect of the problem has, with very few exceptions discussed in Section 1.6,
gone largely unstudied. Contrary to many other algorithmic domains, noise in social network data is not 
an exception, but the norm. Indeed, one could argue that the very notion of a “social link” is not properly 
defined in the first place, so that any representation of a social network is only an approximation of reality. 
This issue is much more pronounced for a goal such as Influence Maximization. Here, the required data 
include, for every pair (u, v) of individuals, a numerical value for the strength of influence from u to v and 
vice versa. This influence strength will naturally depend on context (e.g., what exact product or behavior 
is being spread); furthermore, it cannot be observed directly, and must therefore be inferred from observed 
behavior or individuals’ reports; all of these are inherently very noisy. 

When the inferred influence strength parameters differ from the actual ground truth, even an optimal 
algorithm is bound to return suboptimal solutions, for it will optimize the wrong objective function: a 
solution that appears good with respect to the incorrect parameters may be bad with respect to the actual 
ones. If relatively small errors in the inferred parameters could lead to highly suboptimal solutions, this 
would cast serious doubts on the practical viability of algorithmic influence maximization. Therefore, in the 
present paper, we begin an in-depth study of the effect of noise on the performance of Influence Maximization 
algorithms. 

1.1 The Independent Cascade Model 

We study this question under two widely adopted models for influence diffusion El: the Independent Cascade 
(IC) Model and the Linear Threshold (LT) Model. Both of these models fit in the following framework: The 
algorithm selects a seed set $A_0$ of $k$ nodes, which begin active (having adopted the behavior). Starting
with $A_0$, the process proceeds in discrete time steps: in each time step, according to a probabilistic process,
additional nodes may become active based on the influence from their neighbors. Active nodes never become
inactive, and the process terminates when no new nodes become active in a time step. The goal is to maximize
the expected number of active nodes when the process terminates; this expected number is denoted by $\sigma(A_0)$.

To illustrate the questions and approaches, we describe the IC model in this section. (A formal description 
of the LT model and general definitions of all concepts are given in Section 2.) Under the IC model, the
probabilistic process is particularly simple and intuitive. When a node $u$ becomes active in step $t$, it attempts
to activate all currently inactive neighbors in step $t + 1$. For each neighbor $v$, it succeeds with a known
probability $p_{u,v}$. If it succeeds, $v$ becomes active; otherwise, $v$ remains inactive. Once $u$ has made all these
attempts, it does not get to make further activation attempts at later times. It was shown in m that 
the set of nodes active at the end can be characterized alternatively as follows: for each ordered pair $(u, v)$
independently, insert the directed edge $(u,v)$ with probability $p_{u,v}$. Then, the active nodes are exactly the
ones reachable via directed paths from $A_0$.
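
The live-edge characterization lends itself to a short simulation. The following is a minimal sketch and not the authors' code; the function names `sample_live_edges`, `reachable` and `estimate_sigma`, as well as the dictionary input format mapping each directed edge to its probability, are assumptions made purely for illustration.

```python
import random
from collections import deque


def sample_live_edges(probs, rng):
    """Keep each directed edge (u, v) independently with probability probs[(u, v)]."""
    live = {}
    for (u, v), p in probs.items():
        if rng.random() < p:
            live.setdefault(u, []).append(v)
    return live


def reachable(live, seeds):
    """Return the set of nodes reachable from the seeds via live edges (BFS)."""
    active = set(seeds)
    queue = deque(seeds)
    while queue:
        u = queue.popleft()
        for v in live.get(u, []):
            if v not in active:
                active.add(v)
                queue.append(v)
    return active


def estimate_sigma(probs, seeds, runs=1000, seed=0):
    """Monte Carlo estimate of sigma(seeds): the mean number of reachable nodes."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        total += len(reachable(sample_live_edges(probs, rng), seeds))
    return total / runs
```

For example, `estimate_sigma({("a", "b"): 0.5}, ["a"])` should return a value close to 1.5, since the seed is always active and its single neighbor is reached with probability 1/2.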

1.2 Can Instability Occur? 

Suppose that we have inferred all parameters $p_{u,v}$, but are concerned that they may be slightly off: in reality,
the influence probabilities are $p'_{u,v} \approx p_{u,v}$. Are there instances in which a seed set $A_0$ that is very influential
with respect to the $p_{u,v}$ may be much less influential with respect to the $p'_{u,v}$? It is natural to suspect that
this might not occur: when the objective function $\sigma$ varies sufficiently smoothly with the input parameters
(e.g., for linear objectives), small changes in the parameters only lead to small changes in the objective value;
therefore, optimizing with respect to a perturbed input still leads to a near-optimal solution.

However, the objective $\sigma$ of Influence Maximization does not depend on the parameters in a smooth way.
To illustrate the issues at play, consider the following instance of the IC model. The social network consists
of two disjoint bidirected cliques $K_n$, and $p_{u,v} = \hat{p}$ for all $u, v$ in the same clique; in other words, for each
directed edge, the same activation probability $\hat{p}$ is observed. The algorithm gets to select exactly $k = 1$ node.
Notice that because all nodes look the same, any algorithm essentially chooses an arbitrary node, which may
as well be from Clique 1.

Let $\hat{p} = 1/n$ be the sharp threshold for the emergence of a giant component in the Erdős-Rényi Random
Graph $G(n,p)$. It is well known [U[TD] that the largest connected component of $G(n,p)$ has size $O(\log n)$ for
any $p < \hat{p} - \Omega(1/n)$, and size $\Omega(n)$ for any $p > \hat{p} + \Omega(1/n)$. Thus, if unbeknownst to the algorithm, all true
activation probabilities in Clique 1 are $p < \hat{p} - \Omega(1/n)$, while all true activation probabilities in Clique 2
are $p > \hat{p} + \Omega(1/n)$, the algorithm only activates $O(\log n)$ nodes in expectation, while it could have reached
$\Omega(n)$ nodes by choosing Clique 2. Hence, small adversarial perturbations to the input parameters can lead
to highly suboptimal solutions from any algorithm.¹
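
The sharp change in behavior around the threshold can be observed numerically. The sketch below is an illustration under assumptions of our own choosing (the clique size, the two probabilities $0.5/n$ and $2/n$, and the helper names `clique_cascade_size` and `mean_cascade_size` are not from the original experiments). It runs the IC process from a single seed in one bidirected clique, sampling each directed edge lazily at the moment its tail is processed.

```python
import random


def clique_cascade_size(n, p, rng):
    """Size of one IC cascade from a single seed in a bidirected clique K_n
    with uniform activation probability p."""
    active = {0}
    frontier = [0]
    while frontier:
        frontier.pop()  # this node now makes its one round of attempts
        for v in range(n):
            # Each directed edge into an inactive node is sampled exactly once.
            if v not in active and rng.random() < p:
                active.add(v)
                frontier.append(v)
    return len(active)


def mean_cascade_size(n, p, runs=20, seed=0):
    """Average cascade size over several independent runs."""
    rng = random.Random(seed)
    return sum(clique_cascade_size(n, p, rng) for _ in range(runs)) / runs


n = 300
below = mean_cascade_size(n, 0.5 / n)  # below the threshold 1/n
above = mean_cascade_size(n, 2.0 / n)  # above the threshold 1/n
```

In runs of this kind, `below` stays at a handful of nodes while `above` is a constant fraction of the clique, even though the two probabilities differ only by $\Theta(1/n)$.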

1.3 Diagnosing Instability 

The example of two cliques shows that there exist unstable instances, in which an optimal solution to the 
observed parameters is highly suboptimal when the observed parameters are slightly perturbed compared 
to the true parameters. Of course, not every instance of Influence Maximization is unstable: for instance, 
when the probability $p$ in the Two-Clique instance is bounded away from the critical threshold of $G(n,p)$,
the objective function varies much more smoothly with $p$. This motivates the following algorithmic question,
which is the main focus of our paper: Given an instance of Influence Maximization, can we diagnose
efficiently whether it is stable or unstable? 

To make this question precise, we formulate a model of perturbations. We assume that for each edge 
$(u,v)$, in addition to the observed activation probability $p_{u,v}$, we are given an interval $I_{u,v} \ni p_{u,v}$ of values
that the actual probability $p'_{u,v}$ could assume. The true values $p'_{u,v}$ are chosen from the intervals $I_{u,v}$ by
an adversary; they induce an objective function $\sigma'$ which the algorithm would like to maximize, while the
observed values induce a different objective function $\sigma$ which the algorithm actually has access to.

An instance $(p_{u,v}, I_{u,v})_{u,v}$ is stable if $|\sigma(S) - \sigma'(S)|$ is small for all objective functions $\sigma'$ induced by
legal probability settings, and for all seed sets $S$ of size $k$. Here, “small” is defined relative to the objective
function value $\sigma(A_0^*)$ of the optimum set.

When $|\sigma(S) - \sigma'(S)|$ is small compared to $\sigma(A_0^*)$ for all sets $S$, a user can have confidence that his
optimization result will provide decent performance guarantees even if his input was perturbed. The converse

¹ The example reveals a close connection between the stability of an IC instance and the question whether a uniform
activation probability p lies close to the edge percolation threshold of the underlying graph. Characterizing the percolation 
threshold of families of graphs has been a notoriously hard problem. Successful characterizations have only been obtained for 
very few specific classes (such as $d$-dimensional grids m and $d$-regular expander graphs [2]). Therefore, it is unlikely that a
clean characterization of stable and unstable instances can be obtained. The connection to percolation also reveals that the 
instability was not an artifact of having high node degrees. By the result of Alon et al. [2], the same behavior will be obtained 
if both components are d-regular expander graphs, since such graphs also have a sharp percolation threshold. 
is of course not necessarily true: even in unstable instances, a solution that was optimal for the observed
input may still be very good for the true input parameters. 

1.4 Influence Difference Maximization 

Trying to determine whether there are a function $\sigma'$ and a set $S$ for which $|\sigma(S) - \sigma'(S)|$ is large motivates
the following optimization problem: Maximize $|\sigma(S) - \sigma'(S)|$ over all feasible functions $\sigma'$ and all sets $S$.
For any given set $S$, the objective is maximized either by making all probabilities (and thus $\sigma'(S)$) as small
as possible, or by making all probabilities (and thus $\sigma'(S)$) as large as possible.² We denote the resulting
two objective functions by $\sigma^-$ and $\sigma^+$, respectively. The following definition then captures the optimization
goal.

Definition 1 (Influence Difference Maximization) Given two instances with probabilities $p_{u,v} \ge p'_{u,v}$
for all $u,v$, let $\sigma$ and $\sigma'$ be their respective influence functions. Find a set $S$ of size $k$ maximizing
$\delta(S) := \sigma(S) - \sigma'(S)$.

In this generality, the Influence Difference Maximization problem subsumes the Influence Maximization 
problem, by setting $p'_{u,v} \equiv 0$ (and thus also $\sigma' \equiv 0$).

While Influence Difference Maximization subsumes Influence Maximization, whose objective function is 
monotone and submodular, the objective function of Influence Difference Maximization is in general neither. 
To see non-monotonicity, notice that $\delta(\emptyset) = \delta(V) = 0$, while generally $\delta(S) > 0$ for some sets $S$.

The function is also not in general submodular, a fact brought to our attention by Debmalya Mandal, 
Jean Pouget-Abadie and Yaron Singer, and in contrast to the main result claimed in a prior version of the 
present article. The following example shows non-submodularity for both the IC and LT Models. 

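The graph has four nodes $V = \{u,v,x,y\}$ and three edges $(u,v),(v,x),(x,y)$. The edges $(v,x)$ and
$(x,y)$ are known to have an activation probability of 1, while the edge $(u,v)$ has an adversarially chosen
activation probability in the interval $[0,1]$. With $S = \{u\}$ and $T = \{u, x\}$, we obtain that
$\delta(S + v) - \delta(S) = |\emptyset| - |\{v,x,y\}| = -3$, while $\delta(T + v) - \delta(T) = |\emptyset| - |\{v\}| = -1$.
Since $S \subseteq T$, submodularity would require the marginal gain of $v$ with respect to $S$ to be at least as
large as its marginal gain with respect to $T$, so this example violates submodularity.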
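
The marginal gains in this example can be checked mechanically, since all probabilities involved are 0 or 1 once the adversary picks an extreme value for $(u,v)$. The following sketch (helper names `reach` and `delta` are ours, introduced for illustration) computes $\delta = \sigma^+ - \sigma^-$ by plain reachability.

```python
def reach(edges, seeds):
    """Nodes reachable from `seeds` along the given directed edges (seeds included)."""
    active, stack = set(seeds), list(seeds)
    while stack:
        a = stack.pop()
        for s, t in edges:
            if s == a and t not in active:
                active.add(t)
                stack.append(t)
    return active


CERTAIN = [("v", "x"), ("x", "y")]  # activation probability 1
UNCERTAIN = [("u", "v")]            # probability anywhere in [0, 1]


def delta(seeds):
    """sigma^+(S) - sigma^-(S): edge (u, v) is present for sigma^+ and absent for sigma^-."""
    high = reach(CERTAIN + UNCERTAIN, seeds)
    low = reach(CERTAIN, seeds)
    return len(high) - len(low)


S, T = {"u"}, {"u", "x"}
gain_S = delta(S | {"v"}) - delta(S)
gain_T = delta(T | {"v"}) - delta(T)
print(gain_S, gain_T)  # -3 -1
```

The smaller set $S$ has the smaller marginal gain, which is the opposite of what submodularity demands.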

In fact, we establish a very strong hardness result here, in the form of the following theorem, whose proof 
is given in Section 3.

Theorem 1 Under the Independent Cascade Model, the Influence Difference Maximization objective function
$\delta(S)$ cannot be approximated better than $n^{1-\epsilon}$ for any $\epsilon > 0$ unless $NP \subseteq ZPP$.

1.5 Experiments 

Next, we investigate how pervasive instabilities are in real data. We evaluate frequently used synthetic 
models (2D grids, random regular graphs, small-world networks, and preferential attachment graphs) and 
real-world data sets (computer science theory collaborations and retweets about the Haiti earthquake). We 
focus on the Independent Cascade Model, and vary the influence strengths over a broad range of commonly 
studied values. We consider different relative perturbation levels $\Delta$, ranging from 1% to 50%. The adversary
can thus choose the actual activation probability to lie in the interval $[(1 - \Delta) p_{u,v}, (1 + \Delta) p_{u,v}]$.

To calculate a value for the maximum possible Influence Difference, we use the random greedy algorithm 
of Buchbinder et al. (BJ. This choice of algorithm was motivated by the false belief that the objective function 
is submodular, in which case the algorithm would have provided a 1/e approximation. Notice, however, that 
the algorithm can only underestimate the maximum possible objective function value. Thus, when the 
Random Greedy algorithm finds a set with large influence difference, it suggests that the errors in the objective
caused by misestimated parameters may drown out the objective value, rendering Influence Maximization outputs
very spurious. On the other hand, when the objective value obtained by the Random Greedy algorithm is 
small, no positive guarantees can be provided. 

² This observation relies crucially on the fact that each $p_{u,v}$ can independently take on any value in $I_{u,v}$. If the adversary
were constrained by the total absolute deviation or sum of squares of deviations of parameters, this would no longer be the
case. This issue is discussed in Section 5.
Our experiments suggest that perturbations can have significantly different effects depending on the
network structure and observed values. As a general rule of thumb, perturbations above 20% relative to 
the parameter values could significantly distort the optimum solution. For smaller errors (10% or less 
relative error), the values obtained by the algorithm are fairly small; however, as cautioned above, the actual 
deviations may still be large. 

Since errors above 20% should be considered quite common for estimated social network parameters, our 
results suggest that practitioners exercise care in evaluating the stability of their problem instances, and 
treat the output of Influence Maximization algorithms with a healthy dose of skepticism. 

1.6 Adversarial vs. Random Perturbations 

One may question why we choose to study adversarial instead of random perturbations. This choice is for 
three reasons: 

Theoretical: Worst-case analysis provides stronger guarantees, as it is not based on particular assumptions 
about the distribution of noise. 

Practical: Most random noise models assume independence of noise across edges. However, we believe that 
in practice, both the techniques used for inferring model parameters as well as the data sources they 
are based on may well exhibit systematic bias, i.e., the noise will not be independent. For instance, a 
particular subpopulation may systematically underreport the extent to which they seek others’ advice, 
or may have fewer visible indicators (such as posts) revealing their behavior. 

Modeling Interest: Perhaps most importantly, most natural random noise models do not add anything to 
the IC and LT models. As an illustration, consider the random noise models studied in recent work by 
Goyal, Bonchi and Lakshmanan m and Adiga et al. pQ. Goyal et al. assume that for each edge $(u, v)$,
the value of $p_{u,v}$ is perturbed with uniformly random noise from a known interval. Adiga et al. assume
that each edge $(u, v)$ that was observed to be present is actually absent with some probability $\epsilon$, while
each edge that was not observed is actually present with probability $\epsilon$; in other words, each edge’s
presence is independently flipped with probability $\epsilon$.

The standard Independent Cascade Model subsumes both models straightforwardly. Suppose that a 
decision is to be made about whether $u$ activates $v$. In the model of Goyal et al., we can first draw
the actual (perturbed) value of $p'_{u,v}$ from its known distribution; subsequently, $u$ activates $v$ with
probability $p'_{u,v}$; in total, $u$ activates $v$ with probability $E[p'_{u,v}]$. Thus, we obtain an instance of the
IC model in which all edge probabilities $p_{u,v}$ are replaced by $E[p'_{u,v}]$. In the special case when the
noise has mean 0, this expectation is exactly equal to $p_{u,v}$, which explains why Goyal et al. observed
the noise to not affect the outcome at all.

In the model of Adiga et al., we first determine whether the edge is actually present; when it was
observed present, this happens with probability $1 - \epsilon$; otherwise with probability $\epsilon$. Subsequently, the
activation succeeds with probability $p$ (Adiga et al. assumed uniform probabilities). Thus, the model is an
instance of the IC model in which the activation probabilities on all observed edges are $p(1 - \epsilon)$, while
those on unobserved edges are $p\epsilon$. This reduction explains the theoretical results obtained by Adiga et
al.

More fundamentally, practically all “natural” random processes that independently affect edges of the 
graph can be “absorbed into” the activation probabilities themselves; as a result, random noise does
not at all play the role of actual noise.
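
The absorption argument can be illustrated numerically. The sketch below is an assumption-laden illustration rather than a reproduction of either cited model: the uniform noise interval $[p - w, p + w]$, the function names, and the trial count are all chosen here for demonstration. It shows that drawing a perturbed probability first and then attempting activation yields the same activation rate as using the mean probability directly, and it records the effective probabilities for the edge-flip model.

```python
import random


def goyal_style_activation(p, width, rng):
    """Draw p' uniformly from [p - width, p + width], then attempt activation with p'."""
    p_prime = rng.uniform(p - width, p + width)
    return rng.random() < p_prime


def adiga_style_probability(p, eps, observed):
    """Effective IC probability when each edge's presence is flipped with probability eps."""
    return p * (1 - eps) if observed else p * eps


def empirical_rate(p, width, trials=200_000, seed=1):
    """Fraction of successful activations under mean-zero uniform noise."""
    rng = random.Random(seed)
    hits = sum(goyal_style_activation(p, width, rng) for _ in range(trials))
    return hits / trials
```

With `p = 0.3` and `width = 0.1`, `empirical_rate` should come out very close to 0.3, matching the observation that mean-zero noise leaves the IC instance unchanged.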


2 Models and Preliminaries 

The social network is modeled by a directed graph G = (V , E) on n nodes. All parameters for non-existing 
edges are assumed to be 0. We first describe models of influence diffusion, and then models of parameter 
perturbation. 
2.1 Influence Diffusion Models


Most of the models for Influence Maximization have been based on the Independent Cascade Model (see 
Section 1.1) and Linear Threshold Model studied in m and their generalizations. Like the Independent
Cascade Model, the Linear Threshold Model also proceeds in discrete rounds. Each edge (u, v) is equipped 
with a weight $c_{u,v} \in [0,1]$, satisfying $\sum_{u \to v} c_{u,v} \le 1$ for all nodes $v$. (By $u \to v$, we denote that there is a
directed edge $(u,v)$.) Each node $v$ initially draws a threshold $\theta_v$ independently and uniformly at random
from $[0,1]$. A set $A_0$ of nodes is activated at time 0, and we use $A_t$ to denote the set of nodes active at time
$t$. In each discrete round $t$, each node $v$ checks if $\sum_{u \in A_{t-1}, u \to v} c_{u,v} \ge \theta_v$. If so, $v$ becomes active at time $t$,
and remains active subsequently.
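
The round structure of the Linear Threshold Model can be sketched as follows. This is an illustrative implementation under our own assumptions: the function name `simulate_lt` and the dictionary input format are hypothetical, and thresholds are drawn from $(0, 1]$ rather than $[0, 1]$ so that a node without active in-neighbors can never fire on a zero threshold.

```python
import random


def simulate_lt(weights, seeds, rng):
    """Run one Linear Threshold cascade and return the final active set.

    weights[(u, v)] is the weight c_{u,v} of the directed edge (u, v); for each
    node v, the incoming weights are assumed to sum to at most 1.
    """
    nodes = sorted({x for edge in weights for x in edge} | set(seeds))
    # Thresholds are uniform on (0, 1].
    threshold = {v: 1.0 - rng.random() for v in nodes}
    active = set(seeds)
    while True:
        newly = set()
        for v in nodes:
            if v in active:
                continue
            # Total weight from in-neighbors that were active in the previous round.
            incoming = sum(c for (u, w), c in weights.items() if w == v and u in active)
            if incoming >= threshold[v]:
                newly.add(v)
        if not newly:
            return active
        active |= newly
```

For a single edge of weight 0.25 from a seed, the neighbor becomes active exactly when its threshold is at most 0.25, so repeated runs should activate it in roughly a quarter of the cases.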

Any instance of the Influence Maximization problem is characterized by its parameters. For the LT model, 
the parameters are the $n^2$ edge weights $c_{u,v}$ for all edges $(u,v)$. Similarly, for the IC model, the parameters
are the edge activation probabilities $p_{u,v}$ for all edges $(u,v)$. To unify notation, we write $\boldsymbol{\theta} = (\theta_{u,v})_{(u,v) \in E}$
for the vector of all parameter values, where $\theta_{u,v}$ could be either $c_{u,v}$ or $p_{u,v}$.

Both the IC and LT model define random processes that continue until the diffusion process quiesces, 
i.e., no new activations occur. Let $\tau \le n$ be the (random) time at which this happens. It is clear that $\tau \le n$
always, since at least one more node becomes active in each round. We denote the stochastic process by
$\mathcal{P}^{\text{Mod}}_{\boldsymbol{\theta}}(A_0) = (A_t)_{t=0}^{\tau}$, with $\text{Mod} \in \{\text{IC}, \text{LT}\}$ denoting the model. The final set of active nodes is $A_\tau$. We can
now formally define the Influence Maximization problem:

Definition 2 (Influence Maximization) The Influence Maximization problem consists of maximizing the 
objective $\sigma(A_0) := E[|A_\tau|]$ (i.e., the expected number of active nodes in the end³), subject to a cardinality
constraint $|A_0| \le k$.

The key insight behind most prior work on algorithmic Influence Maximization is that the objective 
function $\sigma(S)$ is a monotone and submodular function of $S$. This was proved for the IC and LT models in
[17], and subsequently for a generalization called Generalized Threshold Model (proposed in [17]) by Mossel
and Roch [22].

2.2 Models for Perturbations 

To model adversarial input perturbations, we assume that for each of the edges $(u,v)$, we are given an
interval $I_{u,v} = [\ell_{u,v}, r_{u,v}] \subseteq [0,1]$ with $\theta_{u,v} \in I_{u,v}$. For the Linear Threshold Model, to ensure that the
resulting activation functions are always submodular, we require that $\sum_{u \to v} r_{u,v} \le 1$ for all nodes $v$. We
write $\Theta = \times_{(u,v) \in E} I_{u,v}$ for the set of all allowable parameter settings. The adversary must guarantee that
the ground truth parameter values satisfy $\boldsymbol{\theta}' \in \Theta$; subject to this requirement, the adversary can choose the
actual parameter values arbitrarily.

Together, the parameter values $\boldsymbol{\theta}$ determine an instance of the Influence Maximization problem. We will
usually be explicit about indicating the dependence of the objective function on the parameter setting. We
write $\sigma_{\boldsymbol{\theta}}$ for the objective function obtained with parameter values $\boldsymbol{\theta}$, and only omit the parameters when
they are clear from the context. For a given setting of parameters, we will denote by $A^*_{\boldsymbol{\theta}} \in \arg\max_S \sigma_{\boldsymbol{\theta}}(S)$
a solution maximizing the expected influence under parameter values $\boldsymbol{\theta}$.

2.3 Influence Difference Maximization 

In order to capture to what extent adversarial changes in the parameters can lead to misestimates of any 
set’s influence, we are interested in the quantity 

$$\max_{S} \max_{\boldsymbol{\theta}' \in \Theta} |\sigma_{\boldsymbol{\theta}}(S) - \sigma_{\boldsymbol{\theta}'}(S)|, \tag{1}$$

³ Our results carry over unchanged if we assign each node a non-negative value $r_v$, and the goal is to maximize $\sum_{v \in A_\tau} r_v$.
We focus on the case of uniform values for notational convenience only.
where $\boldsymbol{\theta}$ denotes the observed parameter values. For two parameter settings $\boldsymbol{\theta}, \boldsymbol{\theta}'$ with $\boldsymbol{\theta} \ge \boldsymbol{\theta}'$ coordinate-wise,
it is not difficult to show using a simple coupling argument that $\sigma_{\boldsymbol{\theta}}(S) \ge \sigma_{\boldsymbol{\theta}'}(S)$ for all $S$. Therefore,
for any fixed set $S$, the maximum is attained either by making $\boldsymbol{\theta}'$ as large as possible or as small as possible.
Hence, solving the following problem is sufficient to maximize (1).

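Definition 3 Given an influence model and two parameter settings $\boldsymbol{\theta}, \boldsymbol{\theta}'$ with $\boldsymbol{\theta} \ge \boldsymbol{\theta}'$ coordinate-wise, define

$$\delta_{\boldsymbol{\theta},\boldsymbol{\theta}'}(S) = \sigma_{\boldsymbol{\theta}}(S) - \sigma_{\boldsymbol{\theta}'}(S). \tag{2}$$

Given the set size $k$, the Influence Difference Maximization (IDM) problem is defined as follows:

$$\text{Maximize } \delta_{\boldsymbol{\theta},\boldsymbol{\theta}'}(S) \quad \text{subject to } |S| = k. \tag{3}$$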
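
For the IC model, the coupling argument behind the monotonicity in the parameters can be made concrete: drawing a single uniform random number per edge and comparing it against both probabilities makes the live-edge graph of the smaller setting a subgraph of the larger one in every sample. The sketch below is one possible way to estimate the influence difference with such a coupling; the function names and input format are assumptions made here, and it is not taken from the original experiments.

```python
import random


def reach(adj, seeds):
    """Nodes reachable from `seeds` in the adjacency-list graph `adj`."""
    active, stack = set(seeds), list(seeds)
    while stack:
        u = stack.pop()
        for v in adj.get(u, []):
            if v not in active:
                active.add(v)
                stack.append(v)
    return active


def coupled_difference(p_high, p_low, seeds, runs=1000, seed=0):
    """Estimate sigma_high(S) - sigma_low(S) using one uniform draw per edge and run.

    Requires p_high[e] >= p_low[e] for every edge e, so that the low live-edge
    graph is always a subgraph of the high one.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        high_adj, low_adj = {}, {}
        for (u, v), ph in p_high.items():
            r = rng.random()
            if r < ph:
                high_adj.setdefault(u, []).append(v)
            if r < p_low.get((u, v), 0.0):
                low_adj.setdefault(u, []).append(v)
        diff = len(reach(high_adj, seeds)) - len(reach(low_adj, seeds))
        assert diff >= 0  # guaranteed by the coupling
        total += diff
    return total / runs
```

Because every sampled difference is non-negative, the estimate of the influence difference is non-negative as well, and sharing the random draws typically reduces the variance compared with two independent simulations.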

3 Approximation hardness 

In this section, we prove Theorem 1.

Proof of Theorem 1. We establish the approximation hardness of Influence Difference Maximization
without any constraint on the cardinality of the seed set $A_0$. From this version, the hardness of the constrained
problem is inferred easily as follows: if any better approximation could be obtained for the constrained
problem, one could simply enumerate over all possible values of $k$ from 1 to $n$, and retain the best
solution, which would yield the same approximation guarantee for the unconstrained problem.

We give an approximation-preserving reduction from the Maximum Independent Set problem to the 
Influence Difference Maximization problem. It is well known that Maximum Independent Set cannot be 
approximated better than $O(n^{1-\epsilon})$ for any $\epsilon > 0$ unless $NP \subseteq ZPP$ [IB].

Let $G = (V, E)$ be an instance of the Maximum Independent Set problem, with $|V| = n$. We construct
from $G$ a directed bipartite graph $G'$ with vertex set $V' \cup V''$. For each node $v_i \in V$, there are nodes $v_i' \in V'$
and $v_i'' \in V''$. The edge set is $E' \cup E''$, where $E' = \{(v_i', v_j'') \mid (v_i, v_j) \in E\}$, and $E'' = \{(v_i', v_i'') \mid v_i \in V\}$.

All edges of $E'$ are known to have an activation probability of 1, while all edges of $E''$ have an activation
probability from the interval $[0,1]$.

The difference is maximized by making all probabilities as large as possible for one function (meaning that all edges
in $E' \cup E''$ are present deterministically), while making them as small as possible for the other (meaning
that exactly the edges in $E'$ are present).

First, let $S$ be an independent set in $G$. Consider the set $S' = \{v_i' \mid v_i \in S\}$. Each node $v_i''$ with $v_i \in S$ is
reachable from the corresponding $v_i'$ in $G'$, but not in $(V' \cup V'', E')$, because $S$ is independent. Hence, the
objective function value obtained in Influence Difference Maximization is at least $|S|$.

Conversely, consider an optimal solution $S'$ to the Influence Difference Maximization problem. Without
loss of generality, we may assume that $S' \subseteq V'$: any node $v'' \in V''$ can be removed from $S'$ without lowering
the objective value. Assume that $S := \{v_i \in V \mid v_i' \in S'\}$ is not independent, and that $(v_i, v_j) \in E$ for
$v_i, v_j \in S$. Then, removing $v_j'$ from $S'$ cannot lower the Influence Difference Maximization objective value
of $S'$: all neighbors of $v_j'$ in $V''$ contribute 0, as they are reachable using $E'$ already; furthermore, $v_j''$ also
does not contribute, as it is reachable using $E'$ from $v_i'$. Thus, any node with a neighbor in $S$ can be removed
from $S'$, meaning that $S$ is without loss of generality independent in $G$.

At this point, all the neighbors of $S'$ contribute 0 to the Influence Difference Maximization objective
function (because they are reachable under $E'$ already), and the objective value of $S'$ is exactly $|S'| = |S|$.
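
On small graphs, the correspondence established by the reduction can be verified by brute force. The following sketch is purely illustrative (the node encoding as pairs `(i, "'")` and `(i, "''")` and all function names are our own choices): it builds $G'$, computes the best unconstrained influence difference by enumerating all seed sets, and compares it with the size of a maximum independent set.

```python
from itertools import combinations


def reach(edges, seeds):
    """Nodes reachable from `seeds` along the given directed edges."""
    active, stack = set(seeds), list(seeds)
    while stack:
        a = stack.pop()
        for s, t in edges:
            if s == a and t not in active:
                active.add(t)
                stack.append(t)
    return active


def build_reduction(n, undirected_edges):
    """Return (nodes, E1, E2): edges in E1 have probability 1, edges in E2 have interval [0, 1]."""
    e1 = [((i, "'"), (j, "''")) for a, b in undirected_edges for i, j in ((a, b), (b, a))]
    e2 = [((i, "'"), (i, "''")) for i in range(n)]
    nodes = [(i, "'") for i in range(n)] + [(i, "''") for i in range(n)]
    return nodes, e1, e2


def max_influence_difference(n, undirected_edges):
    """Best value of sigma^+(S') - sigma^-(S') over all seed sets S' of G'."""
    nodes, e1, e2 = build_reduction(n, undirected_edges)
    best = 0
    for size in range(len(nodes) + 1):
        for seeds in combinations(nodes, size):
            diff = len(reach(e1 + e2, seeds)) - len(reach(e1, seeds))
            best = max(best, diff)
    return best


def max_independent_set_size(n, undirected_edges):
    """Size of a maximum independent set, by exhaustive search."""
    best = 0
    for size in range(n + 1):
        for cand in combinations(range(n), size):
            chosen = set(cand)
            if all(not (a in chosen and b in chosen) for a, b in undirected_edges):
                best = max(best, size)
    return best
```

For instance, on the 5-cycle both functions return 2, and on a star with three leaves both return 3.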




4 Experiments 

While we saw in Section 1.2 that examples highly susceptible (with errors of magnitude $\Omega(n)$) to small
perturbations exist, the goal of this section is to evaluate experimentally how widespread this behavior is for 
realistic social networks. 
4.1 Experimental Setting

We carry out experiments under the Independent Cascade Model, for six classes of graphs — four synthetic 
and two real-world. In each case, the model/data give us a simple graph or multigraph. Multigraphs are 
converted to simple graphs by collapsing parallel edges to a single edge with weight $c_e$ equal to the number of
parallel edges; for simple graphs, all weights are $c_e = 1$. The observed probabilities for edges are $p_e = c_e \cdot p$;
across experiments, we vary the base probability $p$ to take on the values $\{0.01, 0.02, 0.05, 0.1\}$. The resulting
parameter vector is denoted by $\boldsymbol{\theta}$.

The uncertainty interval for $e$ is $I_e = [(1 - \Delta) p_e, (1 + \Delta) p_e]$; here, $\Delta$ is an uncertainty parameter for the
estimation, which takes on the values $\{1\%, 5\%, 10\%, 20\%, 50\%\}$ in our experiments. The parameter vectors
$\boldsymbol{\theta}^+$ and $\boldsymbol{\theta}^-$ describe the settings in which all parameters are as large (as small, respectively) as possible.
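
The construction of the observed probabilities and the two extreme parameter vectors is simple enough to sketch directly. The function names below are hypothetical, and clipping the upper endpoint at 1 is an assumption added here so that all values remain valid probabilities; the text does not discuss that case.

```python
def observed_probabilities(multiplicity, base_p):
    """p_e = c_e * p, where c_e is the number of parallel edges collapsed into e."""
    return {e: c * base_p for e, c in multiplicity.items()}


def perturbation_bounds(p_obs, delta):
    """Return (theta_minus, theta_plus) for relative uncertainty delta.

    The upper end is clipped to 1.0 (an assumption made for this sketch).
    """
    theta_minus = {e: (1 - delta) * p for e, p in p_obs.items()}
    theta_plus = {e: min(1.0, (1 + delta) * p) for e, p in p_obs.items()}
    return theta_minus, theta_plus


# Example: one simple edge and one edge that arose from three parallel edges.
p_obs = observed_probabilities({("a", "b"): 1, ("b", "c"): 3}, base_p=0.1)
theta_minus, theta_plus = perturbation_bounds(p_obs, delta=0.5)
```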

4.2 Network Data 

We run experiments on four synthetic networks and two real social networks. Synthetic networks provide a 
controlled environment in which to compare observed behavior to expectations, while real social networks 
may give us indications about the prevalence of vulnerability to perturbations in real networks that have 
been studied in the past. 

Synthetic Networks. We generate synthetic networks according to four widely used network models. 
In all cases, we generate undirected networks with 400 nodes. The network models are: (1) the 2-dimensional
grid, (2) random regular graphs, (3) the Watts-Strogatz Small-World (SW) Model [37] on a ring with each
node connecting to the 5 closest nodes on each side initially, and a rewiring probability of 0.1, and (4) the
Barabási-Albert Preferential Attachment (PA) Model [3] with 5 outgoing edges per node. For all synthetic
networks, we select k = 20 seed nodes. 

Real Networks. We consider two real networks to evaluate the susceptibility of practical networks: one 
(STOCFOCS) is a co-authorship network of theoretical CS papers; the other (Haiti) is a Retweet network.

The co-authorship network, STOCFOCS, is a multigraph extracted from published papers in the conferences
STOC and FOCS from 1964-2001. Each node in the network is a researcher with at least one
publication in one of the conferences. For each multi-author paper, we add a complete undirected graph
among the authors. As mentioned above, parallel edges are then compressed into a single edge with corresponding
weight. The resulting graph has 1768 nodes and 10024 edges. Due to its larger size, we select 50
seed nodes. 

The Haiti network is extracted from tweets of 274 users on the topic Haiti Earthquake in Twitter. For each 
tweet of user u that was retweeted by v, we add a directed edge (u,v). We obtain a directed multigraph; 
after contracting parallel edges, the directed graph has 383 weighted edges. For this network, due to its 
smaller size, we select 20 seeds. 

In all experiments, we work with uniform edge weights p , since — apart from edge multiplicities — we 
have no evidence on the strength of connections. It is a promising direction for future in-depth experiments 
to use influence strengths inferred from real-world cascade datasets by network inference methods such as 

mmm- 

4.3 Algorithms 

Our experiments necessitate the solution of two algorithmic problems: Finding a set of size k of maximum 
influence, and finding a set of size k maximizing the influence difference. The former is a well-studied problem, 
with a monotone submodular objective function. We simply use the widely known $1 - 1/e$ approximation
algorithm due to Nemhauser et al. [24], which is best possible unless P=NP.

For the goal of Influence Difference Maximization, we established (in Section 3) that the objective function
is hard to approximate better than a factor $O(n^{1-\epsilon})$ for any $\epsilon > 0$. For experimental purposes, we use the
Random Greedy algorithm of Buchbinder et al. [B], given as Algorithm 1 below. It is a natural generalization
of the simple greedy algorithm of Nemhauser et al.: Instead of picking the best single element to add in each
iteration, it first finds the set of the $k$ individually best single elements (i.e., the elements which when added
to the current set give the largest, second-largest, third-largest, ..., $k$th-largest gain). Then, it picks one of
these $k$ elements uniformly at random and continues.

This particular choice of algorithm was motivated by an incorrect claim included in a prior version of 
this work, namely, that the Influence Difference Maximization objective is (non-monotone) submodular. 
For such functions, the Random Greedy algorithm guarantees at least a 0.266-approximation, and the
guarantee improves to nearly $1/e$ when $k \ll n$. Furthermore, the Random Greedy algorithm is simpler and
more efficient than other algorithms with slightly superior approximation guarantees. We stress that these 
guarantees are not obtained for our objective function, as submodularity does not hold. 


Algorithm 1 Random Greedy Algorithm
1: Initialize: $S_0 \leftarrow \emptyset$
2: for $i = 1, \ldots, k$ do
3:     Let $M_i \subseteq V \setminus S_{i-1}$ be the subset of size $k$ maximizing $\sum_{u \in M_i} \left( g(S_{i-1} \cup \{u\}) - g(S_{i-1}) \right)$.
4:     Draw $u_i$ uniformly at random from $M_i$.
5:     Let $S_i \leftarrow S_{i-1} \cup \{u_i\}$.
6: end for
7: Return $S_k$
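The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the names `random_greedy`, `ground_set`, and `g` are hypothetical, `g` is assumed to be any set function taking a frozenset, and the dummy elements that Buchbinder et al. use to pad the candidate set are omitted for brevity.

```python
import random


def random_greedy(ground_set, g, k, rng=None):
    """Sketch of Random Greedy: in each of k rounds, rank the remaining
    elements by marginal gain and add one of the top k uniformly at random.

    Assumption: g maps a frozenset of elements to a number. The dummy
    elements of the original algorithm are not modeled here.
    """
    rng = rng or random.Random()
    selected = set()
    for _ in range(k):
        base_value = g(frozenset(selected))
        candidates = [u for u in ground_set if u not in selected]
        if not candidates:
            break
        # Sort remaining elements by marginal gain, largest first.
        candidates.sort(
            key=lambda u: g(frozenset(selected | {u})) - base_value,
            reverse=True,
        )
        top_k = candidates[:k]  # the k individually best elements (M_i)
        selected.add(rng.choice(top_k))  # u_i drawn uniformly from M_i
    return selected
```

For $k = 1$ the candidate set contains a single element, so the sketch reduces to picking the element with the largest individual value.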


The running time of the Random Greedy Algorithm is $O(kC|V|)$, where $C$ is the time required to estimate
$g(S \cup \{u\}) - g(S)$. In our case, the objective function is #P-hard to evaluate exactly [25, 5], but arbitrarily
close approximations can be obtained by Monte Carlo simulation. Since each simulation takes time $O(|V|)$,
if we run $M = 2000$ iterations of the Monte Carlo simulation for each such estimate, the overall running time of
the algorithm is $O(kM|V|^2)$.
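The Monte Carlo estimation mentioned above can be illustrated for the Independent Cascade Model. The following is a sketch under assumptions: the graph is represented as a dictionary of adjacency lists, `probs` maps each directed edge to its activation probability, and the function names are hypothetical rather than taken from the original code.

```python
import random


def simulate_ic(graph, probs, seeds, rng):
    """Run one Independent Cascade simulation and return the number of
    activated nodes. Each newly active node gets exactly one chance to
    activate each of its currently inactive out-neighbors."""
    active = set(seeds)
    frontier = list(active)
    while frontier:
        u = frontier.pop()
        for v in graph.get(u, ()):
            if v not in active and rng.random() < probs[(u, v)]:
                active.add(v)
                frontier.append(v)
    return len(active)


def estimate_influence(graph, probs, seeds, num_runs=2000, rng=None):
    """Average the cascade size over num_runs independent simulations."""
    rng = rng or random.Random()
    total = sum(simulate_ic(graph, probs, seeds, rng) for _ in range(num_runs))
    return total / num_runs
```

Under the same assumptions, an estimate of the influence difference for a seed set could be obtained by calling `estimate_influence` once with the upper-endpoint probabilities and once with the lower-endpoint probabilities, and subtracting the two averages.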

A common technique for speeding up the greedy algorithm for maximizing a submodular function is the 
CELF heuristic of Leskovec et al. [21]. When the objective function is submodular, the standard greedy 
algorithm and CELF obtain the same result. However, when it is not, the results may be different. In 
the previous version of this article, we had used the CELF heuristic due to the incorrect belief that the 
objective function was submodular. In this revised version, we instead report the results from rerunning 
all the experiments without the use of the CELF heuristic. The single exception is the largest input, the 
STOCFOCS network. (Here, the greedy algorithm without CELF did not finish in a reasonable amount 
of time.) For all networks other than STOCFOCS, the results using CELF are not significantly different 
from the reported results without the CELF optimization. For STOCFOCS, we instead report the result 
including the CELF heuristic. 
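To make the lazy-evaluation idea behind CELF concrete, here is a minimal sketch, assuming a generic set function `g` over frozensets; the function name and data layout are hypothetical. Stale marginal gains are kept in a priority queue and recomputed only when an element reaches the top. The stale values are valid upper bounds only when `g` is submodular, which is why the output may differ from plain greedy otherwise.

```python
import heapq


def celf_greedy(ground_set, g, k):
    """Lazy greedy selection in the spirit of CELF (illustrative sketch).

    Heap entries are (-gain, tiebreak, element, stamp), where stamp is the
    size of the selected set at the time the gain was computed.
    """
    selected = set()
    current_value = g(frozenset())
    heap = []
    for idx, u in enumerate(ground_set):
        gain = g(frozenset({u})) - current_value
        heap.append((-gain, idx, u, 0))
    heapq.heapify(heap)

    for _ in range(k):
        chosen = False
        while heap:
            neg_gain, idx, u, stamp = heapq.heappop(heap)
            if stamp == len(selected):
                # Gain is fresh with respect to the current set: accept it.
                selected.add(u)
                current_value += -neg_gain
                chosen = True
                break
            # Gain is stale: recompute it and push the element back.
            gain = g(frozenset(selected | {u})) - current_value
            heapq.heappush(heap, (-gain, idx, u, len(selected)))
        if not chosen:
            break
    return selected
```

On a submodular objective such as set coverage, this sketch selects the same elements as the standard greedy algorithm while typically evaluating `g` far fewer times.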

4.4 Results 

In all our experiments, the results for the Grid and Small-World network are sufficiently similar that we
omit the results for grids here. As a first sanity check, we empirically computed $\max_{S : |S| = 1} \delta_{\theta^+, \theta^-}(S)$ for
the complete graph on 200 nodes with $I_e = [1/200 \cdot (1 - \Delta), 1/200 \cdot (1 + \Delta)]$ and $k = 1$. According to the
analysis in Section 1.2, we would expect extremely high instability. The results, shown in Table 1, confirm
this expectation.


$\Delta$    $\sigma_{\theta^+}$    $\sigma_{\theta^-}$
50%         66.529                 1.955
20%         23.961                 4.253
10%         15.071                 6.204

Table 1: Instability for the clique $K_{200}$.

Next, Figure 1 shows the (approximately) computed values $\max_{S : |S| = k} \delta_{\theta^+, \theta^-}(S)$ and, for calibration
purposes, $\max_{A_0 : |A_0| = k} \sigma_{\theta}(A_0)$ for all networks and parameter settings. Notice that the result is obtained
by running the Random Greedy algorithm without any approximation guarantee. However, as the algorithm's
output provides a lower bound on the maximum influence difference, a large value suggests that Influence
Maximization could be unstable. On the other hand, small values do not guarantee that the instance is
stable, as the algorithm provides no approximation guarantee.

[Figure 1 panels: (a) Small World, (b) PA, (c) STOCFOCS, (d) Haiti]

Figure 1: Comparison between Influence Difference Maximization and Influence Maximization
results for four different networks. (The result of STOCFOCS network is obtained with CELF
optimization.)

Figure 2: Ratio between the computed values of Influence Difference Maximization and Influence
Maximization under random regular graphs with different degree.

While individual networks vary somewhat in their susceptibility, the overall trend is that larger estimates 
of baseline probabilities $p$ make the instance more susceptible to noise, as do (obviously) larger uncertainty
parameters $\Delta$. In particular, for $\Delta > 20\%$, the noise (after scaling) dominates the Influence Maximization
objective function value, meaning that optimization results should be used with care. 

Next, we evaluate the dependence of the noise tolerance on the degrees of the graph, by experimenting
with random $d$-regular graphs whose degrees vary from 5 to 25. It is known that such graphs are expanders
with high probability, and hence have percolation thresholds of $1/d$ [2]. Accordingly, we set the base probability
to $(1 + \alpha)/d$ with $\alpha \in \{-20\%, 0, 20\%\}$. We use the same setting for uncertainty intervals as in the
previous experiments. Figure 2 shows the ratio between Influence Difference Maximization and Influence
Maximization, i.e., $\frac{\max_{S : |S| = k} \delta_{\theta^+, \theta^-}(S)}{\max_{A_0 : |A_0| = k} \sigma_{\theta}(A_0)}$, with $\alpha \in \{-20\%, 0, 20\%\}$. It indicates that for random regular graphs,
the degree does not appear to significantly affect stability, and that again, noise around 20% begins to pose
a significant challenge. Moreover, we observe that the ratio reaches its minimum when the edge activation
probability is exactly at the percolation threshold $1/d$. This result is in line with percolation theory and also
the analysis of Adiga et al. [1].
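As a small worked illustration of the parameter setting for random $d$-regular graphs, the sketch below computes the base probability $(1 + \alpha)/d$ and an uncertainty interval around it. It assumes the interval has the multiplicative form $[p(1 - \Delta), p(1 + \Delta)]$, as in the clique experiment; the function name and the clipping to $[0, 1]$ are assumptions made here for illustration.

```python
def uncertainty_interval(d, alpha, delta):
    """Return (base, low, high) for a d-regular graph.

    base is (1 + alpha) / d, i.e. the percolation threshold 1/d scaled by
    (1 + alpha). The interval is assumed to be [base*(1-delta), base*(1+delta)],
    clipped to valid probabilities.
    """
    base = (1.0 + alpha) / d
    low = max(0.0, base * (1.0 - delta))
    high = min(1.0, base * (1.0 + delta))
    return base, low, high


# Example: degree 10 with relative noise of 20% at, below, and above threshold.
for alpha in (-0.2, 0.0, 0.2):
    base, low, high = uncertainty_interval(10, alpha, 0.2)
    print(f"alpha={alpha:+.1f}: base={base:.3f}, interval=[{low:.3f}, {high:.3f}]")
```

With $d = 10$ and $\alpha = 0$, the base probability sits exactly at the threshold $1/d = 0.1$, and a relative noise of 20% yields an interval that straddles the threshold.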

As a general takeaway message, for larger amounts of noise (even just a relative error of 20%) — which may 
well occur in practice — a lot of caution is advised in using the results of algorithmic Influence Maximization. 
5 Discussion


We began a study of the stability of Influence Maximization when the input data are adversarially noisy. 
We showed that estimating the susceptibility of an instance to perturbations can be cast as an Influence 
Difference Maximization problem. Unfortunately, the Influence Difference Maximization problem under the 
Independent Cascade Model is as hard to approximate as the Independent Set problem. While we do not 
at present have a comparable approximation hardness result for the Linear Threshold Model, we consider it 
unlikely that the Influence Difference Maximization objective could be much better approximated for that 
model. 

We used the Random Greedy algorithm of Buchbinder et al. to gain an empirical understanding of the
prevalence of instability on several synthetic and real networks. The results suggest that 20% relative error 
could lead to a significant risk of suboptimal outputs. Given the noise inherent in all estimates of social 
network data, this suggests applying extreme caution before relying heavily on results of algorithmic Influence 
Maximization. 

The fact that our main theorem is negative (i.e., a strong approximation hardness result) is somewhat 
disappointing, in that it rules out reliably categorizing data sets as stable or unstable. This suggests searching 
for models which remain algorithmically tractable while capturing some notion of adversarially perturbed 
inputs. The issue of noise in social network data will not disappear, and it is necessary to understand its 
impact more fundamentally. 

While we begin an investigation of how pervasive susceptibility to perturbations is in Influence Maximization
data sets, our investigation is necessarily limited. Ground truth data are by definition impossible
to obtain, and even good and reliable inferred data sets of actual influence probabilities are currently not 
available. The values we assigned for our experimental evaluation cover a wide range of parameter values 
studied in past work, but the community does not appear to have answered the question whether these 
ranges actually correspond to reality. 

At an even more fundamental level, the models themselves have received surprisingly little thorough 
experimental validation, despite having served as models of choice for hundreds of papers over the last decade. 
In addition to verifying the susceptibility of models to parameter perturbations, it is thus a pressing task to 
verify how susceptible the optimization problems are to incorrect models. The verification or falsification of 
sociological models for collective behavior likely falls outside the expertise of the computer science community, 
but nonetheless needs to be undertaken before any significant impact of work on Influence Maximization can 
be truthfully claimed. 